 Querétaro


A Appendix

Neural Information Processing Systems

The complete list may be seen in Table 8. Here are a few general notes about these strings: 1. Based on their recommendations, we did the following: 1. zh, zh_Latn: This resulted in the special filters described below. URLs) the corpora were in languages different from the LangID predictions. This is mainly mis-rendered PDFs and may have practical applications for denoising, or for decoding such garbled PDFs.


Windscribe review: Despite the annoyances, it has the right idea

Engadget

The first step is always to figure out how easy or hard the VPN is to use. Windscribe and other VPNs are important tools, but you'll never use them if the UI gets in the way. I tested Windscribe's desktop apps on Windows and Mac, its mobile apps on iOS and Android and its Chrome and Firefox browser extensions. To start with, let me say that installing Windscribe is a breeze no matter where you do it. The downloaders and installers handle their own business, only requiring you to grant a few permissions. The apps arrive on your system ready to use out of the box.


Irresponsible AI: big tech's influence on AI research and associated impacts

Hernandez-Garcia, Alex, Volokhova, Alexandra, Williams, Ezekiel, Kabakibo, Dounia Shaaban

arXiv.org Artificial Intelligence

The accelerated development, deployment and adoption of artificial intelligence systems has been fuelled by the increasing involvement of big tech. This has been accompanied by increasing ethical concerns and intensified societal and environmental impacts. In this article, we review and discuss how these phenomena are deeply entangled. First, we examine the growing and disproportionate influence of big tech in AI research and argue that its drive for scaling and general-purpose systems is fundamentally at odds with the responsible, ethical, and sustainable development of AI. Second, we review key current environmental and societal negative impacts of AI and trace their connections to big tech and its underlying economic incentives. Finally, we argue that while it is important to develop technical and regulatory approaches to these challenges, these alone are insufficient to counter the distortion introduced by big tech's influence. We thus review and propose alternative strategies that build on the responsibility of implicated actors and collective action.


Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Omnilingual ASR team, Keren, Gil, Kozhevnikov, Artyom, Meng, Yen, Ropers, Christophe, Setzler, Matthew, Wang, Skyler, Adebara, Ife, Auli, Michael, Balioglu, Can, Chan, Kevin, Cheng, Chierh, Chuang, Joe, Droof, Caley, Duppenthaler, Mark, Duquenne, Paul-Ambroise, Erben, Alexander, Gao, Cynthia, Gonzalez, Gabriel Mejia, Lyu, Kehan, Miglani, Sagar, Pratap, Vineel, Sadagopan, Kaushik Ram, Saleem, Safiyyah, Turkatenko, Arina, Ventayol-Boada, Albert, Yong, Zheng-Xin, Chung, Yu-An, Maillard, Jean, Moritz, Rashel, Mourachko, Alexandre, Williamson, Mary, Yates, Shireen

arXiv.org Artificial Intelligence

Automatic speech recognition (ASR) has advanced in high-resource languages, but most of the world's 7,000+ languages remain unsupported, leaving thousands of long-tail languages behind. Expanding ASR coverage has been costly and limited by architectures that restrict language support, making extension inaccessible to most--all while entangled with ethical concerns when pursued without community collaboration. To transcend these limitations, we introduce Omnilingual ASR, the first large-scale ASR system designed for extensibility. Omnilingual ASR enables communities to introduce unserved languages with only a handful of data samples. It scales self-supervised pre-training to 7B parameters to learn robust speech representations and introduces an encoder-decoder architecture designed for zero-shot generalization, leveraging an LLM-inspired decoder. This capability is grounded in a massive and diverse training corpus; by combining breadth of coverage with linguistic variety, the model learns representations robust enough to adapt to unseen languages. Incorporating public resources with community-sourced recordings gathered through compensated local partnerships, Omnilingual ASR expands coverage to over 1,600 languages, the largest such effort to date--including over 500 never before served by ASR. Automatic evaluations show substantial gains over prior systems, especially in low-resource conditions, and strong generalization. We release Omnilingual ASR as a family of models, from 300M variants for low-power devices to 7B for maximum accuracy. We reflect on the ethical considerations shaping this design and conclude by discussing its societal impact. In particular, we highlight how open-sourcing models and tools can lower barriers for researchers and communities, inviting new forms of participation. Open-source artifacts are available at https://github.com/facebookresearch/omnilingual-asr.


The Feasibility of Training Sovereign Language Models in the Global South: A Study of Brazil and Mexico

Malagon, Sandra, Ruiz, Monica A. Ulloa, Plaza, Tatiana Elizabeth Sandoval, Bolívar, Gabriel Rafael Rosario, Mesa, Valentina García, Morales, Ivanna Alvarado

arXiv.org Artificial Intelligence

The rapid escalation of computational requirements for training large-scale language models has reinforced structural asymmetries between high-capacity jurisdictions and countries in the Global South. This paper examines the technical and fiscal feasibility of sovereign-scale language model training in Brazil and Mexico under conditions of constrained hardware access, energy availability, and fiscal ceilings. Using a dual-axis design that varies accelerator generation (NVIDIA H100 vs. A100) and training duration (90 vs. 150 days), we estimate compute demand, energy consumption, capital expenditures, and regulatory compatibility for the training of a 10-trillion-token model. Our findings show that while all configurations remain below export-control and electrical infrastructure thresholds, fiscal viability is determined by hardware efficiency. H100-based scenarios achieve training feasibility at a total cost of 8-14 million USD, while A100 deployments require 19-32 million USD due to higher energy and hardware demand. We argue that extending training timelines should be treated as a policy lever to mitigate hardware constraints, enabling the production of usable, auditable, and locally aligned models without competing at the global frontier. This study contributes to the discourse on AI compute governance and technological sovereignty by highlighting context-sensitive strategies that allow middle-income countries to establish sustainable and strategically sufficient AI capabilities.


A Software-Only Post-Processor for Indexed Rotary Machining on GRBL-Based CNCs

Portugal, Pedro, Venghaus, Damian D., Lopez, Diego

arXiv.org Artificial Intelligence

Affordable desktop CNC routers are common in education, prototyping, and makerspaces, but most lack a rotary axis, limiting fabrication of rotationally symmetric or multi-sided parts. Existing solutions often require hardware retrofits, alternative controllers, or commercial CAM software, raising cost and complexity. This work presents a software-only framework for indexed rotary machining on GRBL-based CNCs. A custom post-processor converts planar toolpaths into discrete rotary steps, executed through a browser-based interface. While not equivalent to continuous 4-axis machining, the method enables practical rotary-axis fabrication using only standard, off-the-shelf mechanics, without firmware modification. By reducing technical and financial barriers, the framework expands access to multi-axis machining in classrooms, makerspaces, and small workshops, supporting hands-on learning and rapid prototyping.

  Genre: Research Report > New Finding (0.68)
  Industry: Education (0.93)

Efficient Adaptation of Deep Neural Networks for Semantic Segmentation in Space Applications

Olivi, Leonardo, Mormile, Edoardo Santero, Tartaglione, Enzo

arXiv.org Artificial Intelligence

In recent years, the application of Deep Learning techniques has shown remarkable success in various computer vision tasks, paving the way for their deployment in extraterrestrial exploration. Transfer learning has emerged as a powerful strategy for addressing the scarcity of labeled data in these novel environments. This paper represents one of the first efforts in evaluating the feasibility of employing adapters toward efficient transfer learning for rock segmentation in extraterrestrial landscapes, mainly focusing on lunar and Martian terrains. Our work suggests that the use of adapters, strategically integrated into a pre-trained backbone model, can be successful in reducing both bandwidth and memory requirements for the target extraterrestrial device. In this study, we considered two memory-saving strategies: layer fusion (to reduce to zero the inference overhead) and an "adapter ranking" (to also reduce the transmission cost). Finally, we evaluate these results in terms of task performance, memory, and computation on embedded devices, evidencing trade-offs that open the road to more research in the field.


Mechanic Modeling and Nonlinear Optimal Control of Actively Articulated Suspension of Mobile Heavy-Duty Manipulators

Paz, Alvaro, Mattila, Jouni

arXiv.org Artificial Intelligence

This paper presents the analytic modeling of mobile heavy-duty manipulators with actively articulated suspension and its optimal control to maximize its static and dynamic stabilization. By adopting the screw theory formalism, we consider the suspension mechanism as a rigid multibody composed of two closed kinematic chains. This mechanical modeling allows us to compute the spatial inertial parameters of the whole platform as a function of the suspension's linear actuators through the articulated-body inertia method. Our solution enhances the computation accuracy of the wheels' reaction normal forces by providing an exact solution for the center of mass and inertia tensor of the mobile manipulator. Moreover, these inertial parameters and the normal forces are used to define metrics of both static and dynamic stability of the mobile manipulator and to formulate a nonlinear programming problem that optimizes such metrics to generate an optimal stability motion that prevents the platform's overturning. The resulting optimal actuator position is tracked with a state-feedback hydraulic valve control. We demonstrate our method's efficiency in terms of C++ computational speed, accuracy and performance improvement by simulating a 7-degree-of-freedom heavy-duty parallel-serial mobile manipulator with four wheels and actively articulated suspension.


Compact Neural Network Algorithm for Electrocardiogram Classification

Frausto-Avila, Mateo, Manriquez-Amavizca, Jose Pablo, U'Ren, Alfred, Quiroz-Juarez, Mario A.

arXiv.org Machine Learning

In this paper, we present a high-performance, compact electrocardiogram (ECG)-based system for automatic classification of arrhythmias, integrating machine learning approaches to achieve robust cardiac diagnostics. Our method combines a compact artificial neural network with feature enhancement techniques, including mathematical transformations, signal analysis and data extraction algorithms, to capture both morphological and time-frequency features from ECG signals. A novel aspect of this work is the addition of 17 newly engineered features, which complement the algorithm's capability to extract significant data and physiological patterns from the ECG signal. This combination enables the classifier to detect multiple arrhythmia types, such as atrial fibrillation, sinus tachycardia, ventricular flutter, and other common arrhythmic disorders. The system achieves an accuracy of 97.36% on the MIT-BIH arrhythmia database, with lower complexity than state-of-the-art models. This compact tool shows potential for clinical deployment, as well as adaptation for portable devices in long-term cardiac health monitoring applications.